Adjoint-based exact Hessian computation

Authors

Abstract

We consider a scalar function depending on the numerical solution of an initial value problem, and its second-derivative (Hessian) matrix with respect to the initial value. The need to extract information from the Hessian, or to solve a linear system having the Hessian as its coefficient matrix, arises in many research fields such as optimization, Bayesian estimation, and uncertainty quantification. From the perspective of memory efficiency, these tasks often employ a Krylov subspace method that does not hold the Hessian explicitly and only requires computing its multiplication by a given vector. One of the ways to obtain an approximation of such a Hessian-vector multiplication is to integrate the so-called second-order adjoint system numerically. However, the error in the approximation could be significant even if the numerical integration is sufficiently accurate. This paper presents a novel algorithm that computes the intended Hessian-vector multiplication exactly and efficiently. To this end, we give a new concise derivation of the second-order adjoint system and show that the intended multiplication can be computed exactly by applying a particular numerical method to that system. In this discussion, symplectic partitioned Runge–Kutta methods play an essential role.
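The workflow the abstract describes — feeding exact Hessian-vector products to a Krylov solver without ever forming the Hessian — can be sketched on a toy objective. The function, matrix `B`, and tolerances below are illustrative assumptions, not taken from the paper:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

# Hypothetical toy objective: f(x) = 0.25 * sum(x**4) + 0.5 * x^T B x with B
# symmetric positive definite, so the Hessian H(x) = diag(3 x^2) + B is SPD and
# conjugate gradients can solve H d = -grad f using only matrix-vector products.
rng = np.random.default_rng(0)
n = 8
M = rng.standard_normal((n, n))
B = M.T @ M + n * np.eye(n)          # SPD matrix
x = rng.standard_normal(n)

grad = x**3 + B @ x                  # exact gradient of f at x

def hessvec(v):
    """Exact Hessian-vector product H(x) v, never materializing H."""
    return 3.0 * x**2 * v + B @ v

H_op = LinearOperator((n, n), matvec=hessvec)
d, info = cg(H_op, -grad, atol=1e-12)   # Newton step via a Krylov method
residual = np.linalg.norm(hessvec(d) + grad)
```

The Krylov solver sees the Hessian only through `hessvec`, which is exactly the access pattern the paper's exact Hessian-vector products are designed to serve.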


Similar resources

Exploiting Symmetry for Hessian Computation

Hessian computation in automatic differentiation can be implemented by applying elimination operations on a symmetric computational graph. We can exploit the symmetry by identifying pairs of symmetric operations and performing only one operation from each pair. Symmetry exploitation can potentially halve the number of operations. In this presentation, we describe an elimination algorithm that e...
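The savings that symmetry buys can be illustrated with a much simpler scheme than graph elimination: a finite-difference Hessian that evaluates only the upper-triangular entries and mirrors them. This is a hypothetical stand-in for the idea, not the elimination algorithm of the abstract:

```python
import numpy as np

def fd_hessian_symmetric(f, x, h=1e-5):
    """Finite-difference Hessian that computes only the n(n+1)/2 upper-triangular
    entries and mirrors them, roughly halving the work -- a toy analogue of the
    symmetry exploitation described above."""
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):              # upper triangle only
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei) - f(x + ej) + f(x)) / h**2
            H[j, i] = H[i, j]              # symmetry: d2f/dxi dxj = d2f/dxj dxi
    return H

f = lambda x: x[0]**2 * x[1] + np.sin(x[1])
x0 = np.array([1.0, 2.0])
H = fd_hessian_symmetric(f, x0)
# analytic Hessian at x0: [[2*x1, 2*x0], [2*x0, -sin(x1)]]
```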


Fast Exact Multiplication by the Hessian

Just storing the Hessian H (the matrix of second derivatives ∂²E/∂wᵢ∂wⱼ of the error E with respect to each pair of weights) of a large neural network is difficult. Since a common use of a large matrix like H is to compute its product with various vectors, we derive a technique that directly calculates Hv, where v is an arbitrary vector. To calculate Hv, we first define a differential operator Rv{f(w)} = (∂...
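The paper's operator yields exact Hessian-vector products via automatic differentiation; as a hedged numerical stand-in, Hv can be approximated by a central difference of a hand-coded gradient, which is the directional derivative the operator formalizes. The error function below is an invented toy, not from the paper:

```python
import numpy as np

def grad(w):
    """Gradient of the toy error E(w) = 0.5 * (w[0] * w[1])**2."""
    return np.array([w[0] * w[1]**2, w[0]**2 * w[1]])

def hess_vec_fd(w, v, r=1e-6):
    """Approximate Hv by a central difference of the gradient along v."""
    return (grad(w + r * v) - grad(w - r * v)) / (2.0 * r)

w = np.array([1.5, -2.0])
v = np.array([1.0, 1.0])
# analytic Hessian of E: [[w1^2, 2*w0*w1], [2*w0*w1, w0^2]]
H = np.array([[w[1]**2, 2 * w[0] * w[1]],
              [2 * w[0] * w[1], w[0]**2]])
Hv_approx = hess_vec_fd(w, v)
```

Like the exact technique, this costs only two gradient evaluations per product, independent of the number of weights.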


Computation of the Adjoint Matrix

Abstract. The best method for computing the adjoint matrix of an order-n matrix in an arbitrary commutative ring requires O(n^β log n log log n) operations, provided the complexity of the algorithm for multiplying two matrices is γn^β + o(n^β). For a commutative domain – and under the same assumptions – the complexity of the best method is 6γn^β/(2^β − 2) + o(n^β). In the present work a new method is prese...
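Independent of the fast methods discussed, the adjoint (adjugate) matrix is characterized by the identity A · adj(A) = det(A) · I, which gives a direct way to check any implementation. The cofactor expansion below is a naive illustrative sketch, nothing like the asymptotically fast algorithms of the abstract:

```python
import numpy as np

def adjugate(A):
    """Naive adjugate via cofactors: adj(A) is the transpose of the cofactor
    matrix, where cofactor (i, j) is (-1)**(i+j) times the minor's determinant."""
    n = A.shape[0]
    C = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 2.0]])
check = A @ adjugate(A)        # should equal det(A) * I
```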


Superlinearly convergent exact penalty projected structured Hessian updating schemes for constrained nonlinear least squares: asymptotic analysis

We present a structured algorithm for solving constrained nonlinear least squares problems, and establish its local two-step Q-superlinear convergence. The approach is based on an adaptive structured scheme due to Mahdavi-Amiri and Bartels of the exact penalty method of Coleman and Conn for nonlinearly constrained optimization problems. The structured adaptation also makes use of the ideas of N...


Optimal Multistage Algorithm for Adjoint Computation

We reexamine the work of Stumm and Walther on multistage algorithms for adjoint computation. We provide an optimal algorithm for this problem when there are two levels of checkpoints, in memory and on disk. Previously, optimal algorithms for adjoint computations were known only for a single level of checkpoints with no writing and reading costs; a well-known example is the binomial checkpointin...
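Checkpointing trades recomputation for memory: the forward sweep stores only selected states, and the backward (adjoint) sweep regenerates each segment from its checkpoint. Below is a minimal single-level, uniformly spaced sketch for the adjoint of a scalar recurrence — an illustrative assumption, not the two-level memory/disk algorithm of the paper:

```python
import math

def adjoint_checkpointed(x0, n, s):
    """Adjoint of x_{k+1} = sin(x_k), i.e. lam_k = cos(x_k) * lam_{k+1},
    storing only every s-th forward state and recomputing inside segments."""
    ckpts = {}
    x = x0
    for k in range(n):                     # forward sweep
        if k % s == 0:
            ckpts[k] = x                   # store every s-th state
        x = math.sin(x)
    lam = 1.0                              # seed: d(x_n)/d(x_n)
    for start in range((n - 1) // s * s, -1, -s):   # segments, last first
        seg = [ckpts[start]]
        for k in range(start, min(start + s, n) - 1):
            seg.append(math.sin(seg[-1]))  # recompute states inside the segment
        for xk in reversed(seg):
            lam *= math.cos(xk)            # one adjoint step per forward state
    return lam                             # equals d(x_n)/d(x_0)

def adjoint_full(x0, n):
    """Reference adjoint that stores every forward state."""
    xs = [x0]
    for _ in range(n):
        xs.append(math.sin(xs[-1]))
    lam = 1.0
    for xk in reversed(xs[:-1]):
        lam *= math.cos(xk)
    return lam
```

With checkpoint spacing s, memory drops from n states to about n/s + s at the price of one extra forward pass of recomputation; the binomial schedules studied in the literature optimize this trade-off exactly.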



Journal

Journal title: BIT Numerical Mathematics

Year: 2021

ISSN: 0006-3835, 1572-9125

DOI: https://doi.org/10.1007/s10543-020-00833-0